We discuss a quantizer which, for every new input sample, adapts its
step-size by a factor depending only on the knowledge of which quantizer
slot was occupied by the previous signal sample. 1 Specifically, if the
outputs of a uniform B-bit quantizer (B > 1) are of the form

$$Y_u = P_u \frac{\Delta_u}{2}; \quad \pm P_u = 1, 3, \ldots, 2^B - 1; \quad \Delta_u > 0,$$

the step-size $\Delta_r$ is given by the previous step-size multiplied by a
time-invariant function of the code-word magnitude $|P_{r-1}|$:

$$\Delta_r = \Delta_{r-1} \cdot M(|P_{r-1}|).$$

The adaptations are motivated by the assumption that the input signal 
variance is unknown, so that the quantizer is started off, in general, with a 
suboptimal step-size $\Delta_{START}$. Multiplier functions that maximize the
signal-to-quantization-error ratio (SNR) depend, in general, on $\Delta_{START}$
and the input sequence length N. For example, if the signal is stationary
and $N \to \infty$, best multipliers, irrespective of $\Delta_{START}$, have values
arbitrarily close to unity. On the other hand, small values of N and suboptimal
values of $\Delta_{START}$ necessitate M values further away from unity. By
including an adequate range of values for N and $\Delta_{START}$ in a generalized
SNR definition, we show how one can determine stable multiplier functions
$M_{OPT}$ that are optimal for a given signal.

In computer simulations of 2- and 3-bit quantizers with first-order 
Gauss-Markovian inputs, we note that, except when the magnitude of the 
correlation C between adjacent samples is very high, $M_{OPT}$ has the property
of calling for fast increases and slow decreases of step-size. We derive 
optimum multipliers theoretically for two simple cases: 

$$M_{OPT} = \left[ \tfrac{1}{2} \left( 1 + \tfrac{K^2 P_{r-1}^2}{4} \right) \right]^{1/2} + \epsilon_1(|P_{r-1}|); \quad C = 0,$$

$$M_{OPT} = \tfrac{K}{2} |P_{r-1}| + \epsilon_2(|P_{r-1}|); \quad C \to 1.$$

K is a constant depending only on B, and $\epsilon_2$ is a positive correction that is
significant only for the last slot: $|P_{r-1}| = 2^B - 1$. Using the example of
C = 0, we also show how the approach of specifying $P_{r-1}$ explicitly in
the determination of $\Delta_r$ is more effective than an earlier procedure 2 where
$\Delta_r$ is determined by past output values $Y_{r-1}$ (rather than by a function of
their components, $P_{r-1}$ and $\Delta_{r-1}$).

Computer simulations with speech and picture signals have shown, once 
again, that SNR-maximizing multiplier functions demand step-size
increases that are relatively faster than step-size decreases. Values of $M_{OPT}$
depend, interestingly, on whether the quantizer is used in a PCM or a
DPCM-type coder. In the case of speech signals, we propose corresponding
tables of $M_{OPT}$ values for B = 2, 3, 4, and 5. DPCM coding of speech
with 3- and 4-bit adaptive quantizers is the subject of a companion paper. 1

I. INTRODUCTION 

Quantization error, in general, can take one of two distinct forms, 
overload distortion or granular noise, reflecting, respectively, situations 
where the quantizer step-size is too small or too large relative to the 
signal being quantized. This distinction has been widely noted for 
1-bit quantizers (delta modulators), and variable step-size quantization 
has therefore been widely discussed in this context. 3-6 The general idea 
is to increase the step-size during overload and decrease it during granu- 
larity, and to detect those conditions on the basis of observations of the 
delta modulator bit stream. The step-size adaptations can be either 
instantaneous 3,5,6 or "syllabic," 4 and the advantages of adaptation have
been shown, among other means, by demonstrations of dynamic range 
and of SNR gains over nonadaptive quantizers. 6 

The problem of step-size adaptations, as applied to quantizers with
more than two output levels, has been less widely studied. It is
conventional in such quantizers to take signal nonstationarity into account
by means of a suitably designed, time-invariant, nonuniform quan- 
tizer. 7 Recently, however, two proposals have incorporated time- 
variant step-size logics in multibit quantization. The first of these 
techniques is a syllabically adapting PCM which Wilkinson empirically 
designed for speech encoding at 10 kb/s. 8 The second proposal is an 
instantaneously adapting quantizer discussed by Stroh, in the context 
of differential encoding of Gaussian signals. 2 Syllabic adaptation has the
advantages that it can be better tailored to a given signal such as
speech and that it can also be designed to provide better resistance to
bit errors 4 than instantaneous adaptation. The latter, on the other
hand, has the advantages of minimal structure and applicability to 
different types of signals, and, in relatively noise-protected environ- 
ments, it constitutes an efficient and simple encoding procedure for 
signal storage or transmission. 

The adaptation that we discuss is instantaneous, and we indicate, 
at the end of this paper, how it can perform better than Stroh's com- 
pandor 2 when working with one word of quantizer (output) memory. 
We must emphasize here that in each case what is being gained by the 
adaptation is increased dynamic range rather than an inherent signal- 
to-noise ratio advantage over a nonadaptive technique. The adaptive 
techniques presuppose that the input signal variance is unknown. The 
quantizer step-size cannot therefore be meaningfully preset to an 
optimized constant value, but must be allowed to adapt itself to signal 
statistics in a fashion determined by a (time-invariant) adaptation 
strategy. 

The specific quantizer configuration that we consider is characterized 
by a uniform spacing of nonzero output levels, and Fig. 1 shows a 
snapshot of the quantizer at sampling instant r for the example of 



Fig. 1 — Uniform quantizer with 8 levels (B = 3). Axes: INPUT (X), marked at $\Delta_r$, $2\Delta_r$, $3\Delta_r$; OUTPUT (Y).
B = 3. The step-size $\Delta_r$ is adapted, for every new input sample, by a
factor depending only on the knowledge of which quantizer slot was
occupied by the previous signal sample. More precisely, if the outputs
of a B-bit quantizer (B > 1) are of the form

$$Y_u = P_u \frac{\Delta_u}{2}; \quad \pm P_u = 1, 3, \ldots, 2^B - 1; \quad \Delta_u > 0, \tag{1}$$

the step-size $\Delta_r$ is given by the previous step-size multiplied by a
time-invariant function of the previous code-word magnitude $|P_{r-1}|$:

$$\Delta_r = \Delta_{r-1} \cdot M(|P_{r-1}|). \tag{2}$$

Note that, according to (2), the entire quantizer is "accordioned in"
when M < 1 and stretched out when M > 1. The resulting quantizer
is also uniform, with a step-size or "slot width" equal to $\Delta_r$. Practical
implementations will also include upper and lower limits $\Delta_{MAX}$ and
$\Delta_{MIN}$ for $\Delta_r$. This is discussed later in the paper.
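The update rule (2), together with the optional step-size limits, can be sketched in a few lines of Python. This is only an illustration: the function name `adapt_step` and its arguments are hypothetical, and the multiplier values are the B = 3 set (0.9, 0.9, 1.25, 1.75) that appears later in the paper.

```python
def adapt_step(step, prev_code, multipliers, step_min=None, step_max=None):
    """Next step-size per eq. (2): previous step times M(|P_{r-1}|).

    prev_code is the odd integer P_{r-1} (+/-1, +/-3, ...).
    multipliers[s - 1] holds the multiplier for slot s (hypothetical layout).
    """
    slot = (abs(prev_code) + 1) // 2      # |P| = 1, 3, 5, ... -> slot 1, 2, 3, ...
    new_step = step * multipliers[slot - 1]
    if step_min is not None:              # optional lower limit on the step-size
        new_step = max(new_step, step_min)
    if step_max is not None:              # optional upper limit on the step-size
        new_step = min(new_step, step_max)
    return new_step


# Example: a 3-bit quantizer driven by an arbitrary code-word history.
M = (0.9, 0.9, 1.25, 1.75)
step = 1.0
for code in (7, 7, -1, 3):
    step = adapt_step(step, code, M)
print(step)
```

Outer-slot code words stretch the quantizer and inner-slot code words shrink it, which is the "accordion" behaviour described above.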

The above logic has been recently employed 1 for efficient differential 
encoding of speech signals at bit rates of 20 to 30 kb/s. The adaptation 
strategy (2) is indeed arbitrary.* But it represents, in the manner of the 
adaptive delta modulator discussed earlier by the author, 6 a very 
simple, yet nontrivial, type of exponential adaptation, and sets a lower 
bound on the performance of possible sophistications that may include 
nonexponential adaptations and the use of longer word memories, i.e.,
the use of $P_{r-2}$, $P_{r-3}$, etc.

A notable result of this paper is that, for many input signals of
practical interest, the step-size multiplier function $M(|P|)$ which
minimizes the mean-squared quantization error demands step-size
decreases that are significantly slower than step-size increases. This
is shown to be true for illustrative speech and picture signals and for
first-order Gauss-Markovian inputs where the magnitude of the
correlation between adjacent samples is not too high (say,
less than 0.9).

In Section II, we discuss computer simulations with a first-order 
Gauss-Markov input. We discuss the simple case of a white signal 
(C = 0) at length. Results show the dependence of
signal-to-quantization-error ratio (SNR) on the function $M(|P|)$ for different values of
B (number of quantizer bits), N (number of samples in input sequence),
and $\Delta_{START}$ (initial step-size). We then specify adequate ranges of

* Schlink has recently described another useful, but perhaps less general, empirical 
system. 9 Here, the adaptation consists in switching between only two quantizing 

characteristics. 
variation for N and $\Delta_{START}$, and thence determine a stable multiplier
function that is optimal for a white Gaussian signal. Further results
include the cases of C = 0.5 and 0.99, and show, for B = 2 and 3,
values of $M_{OPT}$ and SNR gain over a nonadaptive quantizer. We also
provide illustrative histograms of slot occupancies and observed
step-sizes and a family of companding curves for a 4-bit quantizer.

In Section III, we derive optimum multipliers theoretically for the
examples of C = 0 and $C \to 1$. Results substantiate the values of
$M_{OPT}$ from the computer simulation. We also compare our technique
with that of Stroh 2 and discuss the greater efficacy of our adaptation
strategy using the example of C = 0. Finally, in Section IV, we discuss
quantizer simulations with speech and picture inputs. We present
multiplier functions basically similar to those for Gauss-Markov
inputs. Optimal multipliers are found to be slightly different for PCM
and DPCM coders. In the case of speech, we provide separate tables of
$M_{OPT}$ for B = 2, 3, 4, and 5.

II. GAUSS-MARKOV INPUTS 

Our simulations have employed, as quantizer input, a first-order
Gauss-Markovian sequence $\{X_r\}$ of 10,000 samples generated by the
recursive rule

$$X_u = C \cdot X_{u-1} + \sqrt{1 - C^2} \cdot N_u; \quad X_0 = 0, \tag{3}$$

where the samples $N_u$ are drawn from a zero-mean, unit variance,
white Gaussian sequence that is independent of past values of $\{X_r\}$.
The input sequence generated in (3) is itself Gaussian with a mean of
zero, a variance of unity, and a correlation between adjacent samples
equal to the preset constant C.
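The recursion (3) is straightforward to reproduce. The sketch below uses only the Python standard library; the function name `gauss_markov` and the fixed seed are illustrative choices, not part of the original simulation.

```python
import math
import random


def gauss_markov(n, c, seed=0):
    """Generate n samples of X_u = C*X_{u-1} + sqrt(1 - C^2)*N_u with X_0 = 0."""
    rng = random.Random(seed)             # fixed seed so runs are repeatable
    scale = math.sqrt(1.0 - c * c)        # keeps the stationary variance at unity
    x_prev = 0.0                          # X_0 = 0
    samples = []
    for _ in range(n):
        x = c * x_prev + scale * rng.gauss(0.0, 1.0)
        samples.append(x)
        x_prev = x
    return samples


xs = gauss_markov(10_000, 0.5)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
lag1 = sum(a * b for a, b in zip(xs, xs[1:])) / (len(xs) - 1)
print(round(var, 2), round(lag1 / var, 2))   # sample variance and adjacent-sample correlation
```

For a long sequence the sample variance should be close to unity and the adjacent-sample correlation close to C, as stated above.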

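The quantizer output, by definition, is the output level nearest to
the input $X_r$. It is formally written as

$$Y_r = \begin{cases} \left( 2 \left[ \dfrac{|X_r|}{\Delta_r} \right] + 1 \right) \dfrac{\Delta_r}{2} \, \mathrm{sgn}\, X_r; & \left[ \dfrac{|X_r|}{\Delta_r} \right] < 2^{B-1}, \\[2ex] (2^B - 1) \, \dfrac{\Delta_r}{2} \, \mathrm{sgn}\, X_r; & \left[ \dfrac{|X_r|}{\Delta_r} \right] \ge 2^{B-1}, \end{cases} \tag{4}$$

where [·] stands for "greatest integer in."
The quantization error

$$E_r = Y_r - X_r \tag{5}$$

has a magnitude that is bounded by $\Delta/2$ except during overload, which
is expressed by the second line in eq. (4).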
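A minimal Python sketch of this quantizer follows. It assumes the reading of (4) in which the slot index is the integer part of $|X_r|/\Delta_r$ plus one, clamped to the outermost slot during overload; the name `quantize` and the convention sgn(0) = +1 are assumptions made for the example.

```python
import math


def quantize(x, step, bits):
    """Return (P, Y): the odd code word P and the output level Y = P * step / 2."""
    top_slot = 2 ** (bits - 1)                     # number of slots per polarity
    slot = int(math.floor(abs(x) / step)) + 1      # slot containing |x|
    slot = min(slot, top_slot)                     # overload: clamp to the last slot
    sign = 1 if x >= 0 else -1                     # assumption: sgn(0) treated as +1
    code = sign * (2 * slot - 1)
    return code, code * step / 2.0


for x in (0.3, -2.2, 10.0):
    code, y = quantize(x, 1.0, 3)
    print(x, code, y, y - x)                       # last column is the error E = Y - X
```

In the first two cases the error magnitude stays within half a step; in the third (overload) it does not.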
A conventional performance measure is the
signal-to-quantization-error ratio

$$SNR = \frac{\sum_r X_r^2}{\sum_r E_r^2}, \tag{6}$$

where summations are assumed to be over the duration of a statistically
adequate input sequence.

We also refer in this paper to nonadaptive quantizers for which

$$M(|P_{r-1}|) = 1 \ \text{for all } P_{r-1}; \quad \Delta_r = \Delta \ \text{for all } r, \tag{7}$$

and the signal-to-quantization-error ratio $SNR_{NA}$ is a
function of the constant step-size $\Delta$ for this case. The step-size which
maximizes $SNR_{NA}$ for a nonadaptive quantizer will be referred to as
the optimum step-size $\Delta_{OPT}$. Values of $\Delta_{OPT}$ and the corresponding
values of $SNR_{NA}$, for different values of B, have in fact been tabulated
by Max 10 for the case of C = 0. Max's results also specify (via the
Gaussian probability density function) the probability $P_s$ that the
sth slot is occupied in an optimized nonadaptive quantizer:

$$P_s = \mathrm{Prob}\,(P_u = 2s - 1) + \mathrm{Prob}\,(-P_u = 2s - 1); \quad s = 1, 2, \ldots, 2^{B-1}, \tag{8}$$

where $P_u$ is defined by (1). We will see presently that the probability
$P_s$ is also very relevant in the study of an adaptive quantizer when
C = 0.

2.1 A General Performance Criterion 

Adaptive quantizers are needed, as mentioned earlier, when
nonstationary input signals are expected. Our simulations with Gaussian
signals utilized a stationary input (3). To make the study of adaptation
strategies meaningful in this stationary environment, we shall introduce
some unconventional performance measures. For example, consider
the ratio

$$SNR(N, \Delta_{START}) = \sum_{r=1}^{N} X_r^2 \Big/ \sum_{r=1}^{N} E_r^2, \tag{9}$$

where summations are over the first N samples of the input sequence.
The dependence of SNR on $\Delta_{START}$ is significant only for small values
of N. For large N, (9) tends to an asymptotic value that is independent
of $\Delta_{START}$:

$$SNR(\infty) \triangleq \lim_{N \to \infty} SNR(N, \Delta_{START}). \tag{10}$$
Fig. 2 — Step-size histograms (B = 3, C = 0.5, N = 10,000).

In fact, if N is sufficiently large, the value of $\Delta_{START}$ is entirely
academic in the study of adaptive quantizers. See the step-size
histograms in Fig. 2, for example. Notice how they are independent of
$\Delta_{START}$, except for the flat tails representing transient values of $\Delta$.

In adaptive quantization, a suitable multiplier function for a given
signal should provide a compromise between quickness of response [as
measured by the magnitude of (9) for small values of N and bad values
of $\Delta_{START}$] and satisfactory steady-state performance [as measured by
the magnitude of (9) for large values of N and values of $\Delta_{START}$ close
to $\Delta_{OPT}$]. With these opposing factors in mind, we define an average
performance index

$$SNR_{AVE} = \frac{1}{20} \sum_{N} \sum_{\Delta_{START}} SNR(N, \Delta_{START}) \tag{11}$$

for values of N = 10, 100, 1000, and 10,000, and

$$\Delta_{START} = \left( \tfrac{1}{10}, \ \tfrac{1}{\sqrt{10}}, \ 1, \ \sqrt{10}, \ 10 \right) \cdot \Delta_{OPT}.$$

The target values of N and Astart above have been chosen with the 
following factors in mind : 

(i) First, as mentioned earlier, infinitesimally small ranges of
values (for example, $\Delta_{START} = \Delta_{OPT}$; any N) are uninteresting
because they can result in $M_{OPT}$ values arbitrarily close to the
trivial value of unity.

(ii) On the other hand, overly wide ranges of parameters which
include combinations like (N = 1, $\Delta_{START} = 10^4 \Delta_{OPT}$) reflect
pathological situations and lead to multiplier specifications that
tend to be quite uncorrelated with the statistical nature of the
signal being quantized.

(iii) As long as the extreme situations in (i) and (ii) are avoided, it
has been found that $M_{OPT}$ values are not overly sensitive to the
actual N and $\Delta_{START}$ values employed in the performance
criterion (11), but depend mainly on the statistics of the signal being
encoded. In fact, optimal multipliers in this case are merely the
best multipliers in a variance-estimating problem (see the
theory for C = 0 in Section III) that includes neither N nor
$\Delta_{START}$ as a significant parameter.

(iv) With the aforementioned factors in mind, the specific values 
of N and $\Delta_{START}$ in (11) were selected to have the following
significance for a typical application such as speech
quantization. First, the 40-dB range for $\Delta_{START}$ reflects an extent of
uncertainty (about signal power) which is reasonably charac- 
teristic of telephone conversation. 7 Second, when one considers 
Nyquist-sampled speech for applications like adaptive PCM or 
adaptive DPCM, 1 the values of N in (11) correspond at the 
lower end to about 1 millisecond of speech, and at the higher 
end to about 1 second of speech. This range clearly includes the 
range of durations that one may associate with "steady-state" 
or "stationary" segments in the acoustic waveform. In fact, if 
one considers phoneme durations, values of N in the range 100 
to 5000 seem to provide an adequate model. It is our contention 
that by using N values of this type in an index of performance 
such as (11), we can very usefully assess M-functions for
quantizing locally stationary signals such as speech, even when
simulating the quantizer with a (standard and easily
duplicated) stationary Gaussian input. Actually, however, we have
carried out completely independent simulations with real speech
signals as well (Section IV), and the results of this section are
directed toward the quantization of Gaussian inputs as such. 

2.2 Multiplier Functions for B = 2, C = 0

Table I illustrates the nature of the SNR function (9) for two 
multiplier functions in a 2-bit quantizer. The first multiplier function 
Table I — Example of SNR Functions for B = 2, C = 0
(Entries in dB)

                                       Values of N
  20 log($\Delta_{START}/\Delta_{OPT}$)     10      100     1000    10,000

  $M_1$ = 0.8, $M_2$ = 1.6
        -20                6.4      7.2      7.4      7.3
        -10               10.5      8.9      7.9      7.3
          0                9.7      8.3      7.6      7.3
         10                5.8      7.2      7.4      7.2
         20               -5.9      4.2      7.1      7.3

  $M_1$ = 0.98, $M_2$ = 1.04
        -20                1.6      3.8      8.4      9.1
        -10                5.2      5.8      8.9      9.2
          0               10.7      8.0      9.4      9.2
         10                0.0      5.9      9.0      9.2
         20              -13.2     -5.0      3.5      8.1


shows quicker response (better SNR values for N = 10 and 100), while
the second function achieves a better asymptotic value of SNR (at
N = 10,000). Obviously, the poor asymptotic performance of the first
M-function is due to overly abrupt step-size oscillations in the
"steady-state," while the inferior performance of the second M-function for
small N is due to sluggish adaptations of $\Delta$ when $\Delta_{START}$ is suboptimal.
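As a numerical check on the averaging in (11), the 20 entries of each half of Table I can be averaged directly. The sketch below does this in Python; the dictionary layout is an illustrative choice, and the observation that the plain mean of the decibel entries reproduces the 6.8 dB and 5.3 dB figures listed in Table II is an inference from the tabulated numbers rather than something the text states.

```python
# Table I entries in dB; rows are 20*log10(step_start/step_opt) = -20, -10, 0, 10, 20
# and columns are N = 10, 100, 1000, 10,000.
TABLE_I = {
    (0.80, 1.60): [
        [6.4, 7.2, 7.4, 7.3],
        [10.5, 8.9, 7.9, 7.3],
        [9.7, 8.3, 7.6, 7.3],
        [5.8, 7.2, 7.4, 7.2],
        [-5.9, 4.2, 7.1, 7.3],
    ],
    (0.98, 1.04): [
        [1.6, 3.8, 8.4, 9.1],
        [5.2, 5.8, 8.9, 9.2],
        [10.7, 8.0, 9.4, 9.2],
        [0.0, 5.9, 9.0, 9.2],
        [-13.2, -5.0, 3.5, 8.1],
    ],
}


def snr_ave(rows):
    """Plain average of all table entries (assumed reading of eq. (11))."""
    values = [v for row in rows for v in row]
    return sum(values) / len(values)


for pair, rows in TABLE_I.items():
    print(pair, round(snr_ave(rows), 2))
```

The fast multiplier pair wins on this index even though the slow pair has the better asymptotic SNR, which is the trade-off described above.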
Table II compares several M-functions* for a 2-bit quantizer on the
basis of (11). The functions included represent a subset of many more
functions which were simulated and compared on the basis of $SNR_{AVE}$.
The best value of 6.8 dB has been noted for $M_1$ = 0.80, $M_2$ = 1.60,
although this function provides a clearly nonmaximal asymptotic
performance (Table I). The first five functions in Table II also satisfy

Table II — Comparison of Multiplier Functions (B = 2, C = 0)

  $M_1$     $M_2$     $SNR_{AVE}$ (dB)
  0.71    2.00        5.9
  0.80    1.60        6.8
  0.90    1.20        6.5
  0.95    1.10        6.1
  0.98    1.04        5.3
  0.95    1.20        5.9
  0.50    2.00        5.8
  0.90    1.10        5.2


* Whenever there is no scope for confusion, we shall use the symbols $M_1$, $M_2$, $M_3$,
and $M_4$ instead of $M_r(1)$, $M_r(3)$, $M_r(5)$, and $M_r(7)$.
the interesting constraint suggested by Goodman: 11

$$M_1^{0.67} \cdot M_2^{0.33} \simeq M_1^{P_1} \cdot M_2^{P_2} \simeq 1, \tag{12}$$

where $P_1 \approx 0.67$ and $P_2 \approx 0.33$ are the probabilities of inner- and
outer-slot occupancy in a nonadaptive quantizer with an optimal
$\Delta_{OPT}$ for the Gaussian input. Goodman conjectures that the
probabilities of using $M_1$ and $M_2$ in a well-designed adaptive quantizer
should indeed be equal to the parameters $P_1$ and $P_2$ of the nonadaptive
quantizer. A constraint of the form (12) then represents a stability
Fig. 3 — Step-size histograms (B = 2, C = 0, N = 10,000).

Fig. 4 — Comparison of multiplier functions (B = 2, C = 0, $\Delta_{START} = 0.1\,\Delta_{OPT}$).

criterion which specifies that the random step-size $\Delta_r$ neither grows out
of bounds, independently of the input, nor decays to infinitesimal
values. This criterion has been discussed earlier in the context of
adaptive delta modulation with a 1-bit memory. 6
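The stability constraint (12) is easy to evaluate for the multiplier pairs of Table II. The following Python sketch does so, taking the slot probabilities as 0.67 and 0.33 as quoted above; the function name `stability_product` is a hypothetical label for the left-hand side of (12).

```python
def stability_product(multipliers, probabilities):
    """Return the product of M_s ** P_s over all slots (left side of eq. (12))."""
    product = 1.0
    for m, p in zip(multipliers, probabilities):
        product *= m ** p
    return product


P = (0.67, 0.33)   # inner- and outer-slot probabilities quoted for B = 2
pairs = [(0.71, 2.00), (0.80, 1.60), (0.90, 1.20), (0.95, 1.10),
         (0.98, 1.04), (0.95, 1.20), (0.50, 2.00), (0.90, 1.10)]
for pair in pairs:
    print(pair, round(stability_product(pair, P), 3))
```

The first five pairs give a product within about one percent of unity, while (0.9, 1.10) and (0.5, 2.0) fall noticeably below it, so the step-size for those pairs tends to drift downward.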

The desirability of constraint (12) on step-size multipliers is also 
demonstrated by the step-size histograms in Fig. 3. The multiplier 
pairs (0.9, 1.2) and (0.71, 2.0) satisfy constraint (12), and the corre- 
sponding histograms have the desirable property that they are centered 
on $\Delta_{OPT}$ although they have different dispersions (suggesting differences
in quickness of response and steady-state performance). The function
(0.9, 1.10), on the other hand, produces a histogram whose mode is
clearly displaced from $\Delta_{OPT}$. This suggests that (0.9, 1.10) falls in a
Table III — SNR Function for $M_1$ = 0.90, $M_2$ = 0.90, $M_3$ = 1.25,
$M_4$ = 1.75 (B = 3, C = 0, Entries in dB)

                                       Values of N
  20 log($\Delta_{START}/\Delta_{OPT}$)     10      100     1000    10,000
        -20               11.9     11.2     12.7     12.7
        -10               14.5     11.8     12.6     12.7
          0               15.7     11.2     12.9     12.7
        +10               11.8     11.6     12.5     12.7
        +20               -1.6      8.7     12.3     12.7


class of inefficient multiplier functions; this is attributed to the fact
that the function (0.9, 1.10) clearly violates requirement (12) above. 11
Finally, in Fig. 4, we show SNR (6) as a function of N for a fixed
value of $\Delta_{START}$, and for different M-functions. It is once again
apparent that the adaptation function (0.8, 1.6) provides an attractive
combination of responsiveness and asymptotic performance for B = 2.

2.3 Multiplier Functions for B = 3, C = 0

Table III demonstrates the nature of the SNR function (9) for
B = 3 and a specific multiplier function. Table IV uses the performance
criterion (11) to show the efficiency of this multiplier function (0.9,
0.9, 1.25, 1.75). As in the 2-bit example, the M-functions in Table IV
are only a subset of a much larger set of M-functions which were
simulated and compared on the basis of $SNR_{AVE}$. We have only
included the most interesting functions from our search for maximum
$SNR_{AVE}$. The first three M-functions in Table IV satisfy a stability
constraint analogous to (12):



$$M_1^{P_1} \cdot M_2^{P_2} \cdot M_3^{P_3} \cdot M_4^{P_4} \simeq 1. \tag{13}$$



It is interesting that the best function in Table IV belongs to the class 
of functions obeying (13). Notice also that the reduction of the number 
of distinct step-size multipliers (second row in Table IV) leads to a 

Table IV — Comparison of Multiplier Functions (B = 3, C = 0)

  $M_1$     $M_2$     $M_3$     $M_4$     $SNR_{AVE}$ (dB)
  0.90    0.90    1.25    1.75       11.7
  0.90    1.00    1.00    1.75       11.4


0.5 1.0 
0.3 0.9 
2.0
9.6
8.9

Fig. 5 — Histogram of slot occupancies (B = 3, C = 0, N = 10,000).



marginal decrease of $SNR_{AVE}$. This tolerance to a reduction of the
number of distinct multipliers does seem to extend, although in lesser 
measure, to larger values of B and to speech and picture signals. 

Finally, Fig. 5 shows a histogram of slot occupancies for the best 
M-function in Table IV. The number of quantizer slots or output 
levels is equal to four (neglecting signs), and the dotted fifth slot 
refers to the overload probability that has been accumulated into the 
fourth bar of the histogram. It is interesting that, despite step-size 
adaptations, the Gaussian nature of the input density function shows 
up in the histogram. The heights of the bars in Fig. 5 represent experi- 
mental slot probabilities of 0.47, 0.30, 0.14, and 0.09. Notice again 
that, in the manner of (13) : 



$$0.9^{0.47} \cdot 0.9^{0.30} \cdot 1.25^{0.14} \cdot 1.75^{0.09} = 0.994 \simeq 1. \tag{14}$$



2.4 Comparison of Adaptive and Nonadaptive Quantizers 

Table V summarizes the nature of optimal multiplier functions for 
B = 2 and 3. These functions are obtained on the basis of criterion 
(11). Values of M are generally rounded, representing broad optima, 
Table V — Quantization of Gauss-Markov Inputs [Entries are
SNR (10,000, $\Delta_{OPT}$) values in dB]

   B                  C = 0.00    C = 0.50    C = 0.99

   2    $SNR_{NA}$         9           9           9
        $SNR_{A}$          7           8          11
        M(1)          0.8         0.8         0.5
        M(2)          1.6         1.6         2.0

   3    $SNR_{NA}$        14          14          14
        $SNR_{A}$         13          13          16
        M(1)          0.90        0.90        0.30
        M(2)          0.90        0.90        0.90
        M(3)          1.25        1.25        1.50
        M(4)          1.75        1.75        2.10



and the precision in the specification of M values may be as bad as 
±5 percent in some cases. 

To provide a fair comparison with optimal nonadaptive quantizers,
the performance figure used in Table V is the asymptotic value (10).
Formally, the notation used in the table is as follows:

$$SNR = SNR(10{,}000, \Delta_{OPT}). \tag{15}$$

The subscript A refers to the adaptive quantizer with step-size
multipliers optimized using (11), while the subscript NA refers to a
nonadaptive quantizer with constant step-size $\Delta_{OPT}$. The SNR values are
in dB, and are rounded to the nearest integer.

Fig. 6 — Conditional density function of quantizer input.

Note that negative values of C are not included in the table. The 
assumption of a symmetrical quantizer (Fig. 1) renders the quantizer 
design independent of the sign of C. Specifically, the quantizer input
$X_r$ has a probability density function (conditioned on $X_{r-1}$) that is
sketched in Fig. 6; the optimum step-size is that which fits the
quantizer to this density function in a way that minimizes the sum of 
overload error variance and granular error power. This optimum 
depends only on the disposition of the PDF in Fig. 6 and the magnitude 
of the nonzero mean, not the sign of it. 

Finally, Table V assumes that no constraints exist on the minimum 
and maximum values of step-size. Practical implementations will, of 
course, involve such constraints (see Fig. 8), as well as constraints on 
actual multiplier values. Significant conclusions from Table V are the 
following : 

(i) Except for C = 0.99, optimal multipliers are such that step-size 
decreases are always slower than step-size increases. The observation 
has been found to extend for B = 4 also and, as seen later (Section IV), 
to the quantization of speech and picture signals as well. 

The need for fast increases of step-size and slow decreases thereof 
may be physically explained as follows. Quantization errors during 
overload tend to be more harmful than those during granularity, in 
that the magnitude of granular error is restricted, by definition, to a 
half step-size, while no such simple constraint exists for an overload 
error. It is therefore reasonable to decrease step-sizes (relatively) 
slowly to avoid unduly small step-sizes leading to the harmful overload 
errors. The observation is obviously less significant for a coarser 
quantizer than for a finer quantizer because granular errors in the 
former are more comparable in magnitude to overload errors and 
hence more equally harmful. This is indeed reflected in Table V. Note 
that, for a given value of C, the disparity in rates of step-size increases 
and step-size decreases is least for the coarser quantizer (B = 2). 

There is an alternative explanation for (i) above, which also clarifies 
why the disparity between the speeds of step-size increase and step- 
size decrease is less apparent for large values of C. Refer to the stability 
constraints (12) and (13), as discussed for the case of C = 0. It turns 
out that in the uniform (nonadaptive) quantization of a Gaussian 
signal, the probability $P_s$ (8) is a monotonically decreasing function of
s. It follows then, as seen in (12) and (13), that multipliers for
step-size decreases have greater probabilities of being employed, and hence
must lead to slower step-size changes each time they are actually used.
Explicitly, for B = 2, (12) can be rewritten:

$$\frac{\ln M_2}{\ln (1/M_1)} = \frac{P_1}{P_2}. \tag{16}$$

Obviously, then, if $P_1 > P_2$, the step-size increase (as given by $M_2$)
is faster than the step-size decrease (as given by $M_1$).
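The step from (12) to (16) is a single logarithm. Treating (12) as an equality, and noting that $\ln M_1 < 0$ because $M_1 < 1$:

```latex
\begin{align*}
M_1^{P_1} M_2^{P_2} = 1
  &\;\Longrightarrow\; P_1 \ln M_1 + P_2 \ln M_2 = 0 \\
  &\;\Longrightarrow\; P_2 \ln M_2 = P_1 \ln (1/M_1) \\
  &\;\Longrightarrow\; \frac{\ln M_2}{\ln (1/M_1)} = \frac{P_1}{P_2}.
\end{align*}
```

With $P_1 \approx 0.67$ and $P_2 \approx 0.33$ the ratio is about 2, so on a logarithmic scale each step-size increase is roughly twice as large as each decrease; the pair (0.8, 1.6) gives $\ln 1.6 / \ln 1.25 \approx 2.1$.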

The argument for nonzero values of C is very similar, except that
the probabilities $P_s$ peculiar to a Gaussian probability density function
should now be replaced by probabilities $P_s(C)$ that refer to the uniform
quantization of the asymmetrical conditional PDF in Fig. 6.
Apparently, the probabilities $P_s(C)$ are not monotonically decreasing for
C = 0.99. This is why the requirement of relatively more rapid
step-size increases is waived for the example of C = 0.99 (while being still
true for C = 0.5).

Fig. 7 — Histogram of slot occupancies (B = 3, C = 0.99, N = 10,000).

Figure 7 shows a histogram of slot-occupancies for B = 3 and
C = 0.99. Compare this non-monotonic PDF with the Gaussian
histogram for C = 0 (Fig. 5). In analogy with (13), the stability criterion
associated with Fig. 7 is

$$0.3^{0.07} \cdot 0.9^{0.63} \cdot 1.5^{0.21} \cdot 2.1^{0.09} = 0.993 \simeq 1. \tag{17}$$

Finally, it should be mentioned that the (relatively) slow step-size 
decreases in Table V are fast enough, in an absolute sense, for typical 
quantizer applications. For example, if M = 0.95, and a step-size 
decrease of 20 dB is needed for adaptation to an idle-channel situation 
in speech quantization, the time needed for such adaptation will be 
45 samples. For Nyquist-sampled speech, this is only about 5 ms. 
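The 45-sample figure follows from compounding the multiplier until the step-size has fallen by the required ratio. A short Python check is given below; the helper name `samples_to_change` is hypothetical, and the 8-kHz sampling rate used to convert samples to milliseconds is an assumed value for telephone-band speech.

```python
import math


def samples_to_change(multiplier, change_db):
    """Number of successive applications of `multiplier` needed to change the
    step-size by `change_db` decibels (negative for a decrease)."""
    ratio = 10.0 ** (change_db / 20.0)            # amplitude ratio for the dB change
    return math.ceil(math.log(ratio) / math.log(multiplier))


n = samples_to_change(0.95, -20.0)
sampling_rate_hz = 8000.0                         # assumption: 8-kHz sampled speech
print(n, round(1000.0 * n / sampling_rate_hz, 1)) # samples, milliseconds
```

The same helper shows how quickly an increase multiplier acts; for example, a multiplier of 1.75 covers a 20-dB increase in only a handful of samples.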

(ii) Although the quantization problem for C = 0.5 is qualitatively
similar to that for C = 0.99 (Fig. 6), we note that results for C = 0.5
(Table V) are nearly identical with those for C = 0. The differences in
$M_{OPT}$ values that are caused by a nonzero C = 0.5 were apparently
too small to be detected in our finite search for best multipliers.

(iii) Referring again to Table V, the best adaptive quantizers seem
to have an SNR advantage over the nonadaptive scheme (working 
with an optimal step-size) only for very highly correlated inputs. In 
fact, in many instances, the SNR gain resulting from adaptation is 
seen to be negative (due, evidently, to overly abrupt manipulations of 
step-size). 

The reason for using an adaptive quantizer in these situations is 
only to facilitate quantizations with much less knowledge of the input — 
equivalently, with much less knowledge of $\Delta_{OPT}$ than is necessary for
an equivalent performance in the nonadaptive case. In other words,
step-size adaptations increase the dynamic range of the quantizer and
enable it to handle inputs with large amplitude variations, such as 
nonstationary signals. 

The above idea has already been demonstrated by the asymptotic 
SNR values in Tables I and III. To provide a more application- 
oriented illustration, we undertook two extensions of our computer 
simulation. These experiments employed B = 4, C = 0.5, and the fol- 
lowing multiplier function : 

(0.90, 0.90, 0.95, 1.0, 1.2, 1.5, 1.8, 2.1). (18) 

Finite step-size dictionaries were used, determined by maximum and
minimum step-sizes $\Delta_{MAX}$ and $\Delta_{MIN}$. The starting step-size was set
equal to $\Delta_{OPT}$, subject, however, to modification because of the
constraints $\Delta_{MAX}$ and $\Delta_{MIN}$.

In the first of these extensions, the step-size dictionary had the
characterization

$$\Delta_{MAX} \cdot \Delta_{MIN} = \Delta_{OPT}^2, \tag{19}$$

$$\Delta_{MAX} / \Delta_{MIN} = R, \tag{20}$$

and the quantizer performance was studied in terms of SNR (10,000,
$\Delta_{OPT}$) as a function of R. It was reassuring to note that the SNR was
constant to within 1 dB for sample values of R in the range 1 to
$\infty$, due, no doubt, to the safe design feature (19). In fact, a maximum
SNR was noted for a finite value of R.

In a more revealing second experiment, the quantizer was "centered"
at a value $\Delta_{MID}$ not necessarily equal to $\Delta_{OPT}$:

$$\Delta_{MAX} \cdot \Delta_{MIN} = \Delta_{MID}^2, \tag{21}$$

and the performance was measured as a function of $\Delta_{MID}$ for values of
R (20) equal to 1, 10, and 100. Note that R = 1 refers to the
nonadaptive case.
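Given a centre value and a ratio R, the two limits follow directly. The Python sketch below assumes that the centring condition is the geometric-mean relation (the product of the two limits equals the square of the centre value) and that the ratio of the limits is R; the function name `step_limits` is hypothetical.

```python
import math


def step_limits(step_mid, ratio):
    """Return (step_min, step_max) with step_max / step_min = ratio and
    step_max * step_min = step_mid ** 2 (assumed geometric centring)."""
    root = math.sqrt(ratio)
    return step_mid / root, step_mid * root


for r in (1, 10, 100):
    lo, hi = step_limits(1.0, r)
    print(r, round(lo, 3), round(hi, 3))
```

For R = 1 the two limits coincide and the step-size cannot move, which is the nonadaptive case.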

Figure 8 plots these results. The monotonic improvement of dynamic
range with increasing R is apparent. It is expected 1 that practical
quantizers can be designed with values of R equal to 100 or more.

Fig. 8 — Companding characteristics (B = 4, C = 0.5).

III. THEORETICAL DERIVATION OF OPTIMAL MULTIPLIERS 

In this section, we shall regard the adaptive quantization problem
as one of learning signal variance. In other words, the problem of
determining an optimum instantaneous step-size $\Delta_r$ is regarded as
being tantamount to that of finding the best estimate at time r of
the conditional standard deviation $S_r$ of the quantizer input, and of
setting $\Delta_r$ proportional to this estimate:

$$\Delta_r^{OPT} = K(B) \cdot S_r(\Delta_{r-1}, P_{r-1}). \tag{22}$$

3.1 Case of C = 0

The constant K is an obvious function of the number of quantizer
levels and, hence, of B. For the problem of uniform quantization of a
zero-mean Gaussian signal, Max's Table II 10 specifies the following
values for K(B):*

$$K(1) = 1.596, \quad K(2) = 0.996, \quad K(3) = 0.586, \quad K(4) = 0.335. \tag{23}$$

The dependence of $S_r$ on $\Delta_{r-1}$ and $P_{r-1}$ (22) is, of course, characteristic
of an adaptation strategy which uses a 1-word memory.

We now propose that the variance of $X_r$ be estimated as the average
of the squares of (i) $X_{r-1}$, the most recent quantizer input, and (ii)
$\hat{S}_{r-1}$, the most recent estimate of S. In other words, let†

$$\hat{S}_r^2 = \tfrac{1}{2}\left(X_{r-1}^2 + \hat{S}_{r-1}^2\right). \tag{24}$$

We next recall the identity

$$X_{r-1} = Y_{r-1} - E_{r-1} = \frac{P_{r-1}\Delta_{r-1}}{2} - E_{r-1}, \tag{25}$$

where $E_{r-1}$ is the quantization error. Furthermore, by virtue of the
basic algorithm (22), we suggest that

$$\hat{S}_{r-1}^2 = \Delta_{r-1}^2 / K^2. \tag{26}$$

Let us use (25) and (26) in (24) and set the resulting value of $\hat{S}_r$ in
(22). We obtain, after some algebra:

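$$\left(\frac{\Delta_r^{\mathrm{OPT}}}{\Delta_{r-1}}\right)^2 = \frac{K^2}{2}\left[\frac{P_{r-1}^2}{4} + \frac{1}{K^2} + \frac{1}{\Delta_{r-1}^2}\left(E_{r-1}^2 - E_{r-1}\Delta_{r-1}P_{r-1}\right)\right]. \tag{27}$$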
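The algebra behind (27) can be spelled out as follows. This is a reconstruction from (22), (24), (25), and (26), assuming the output-level convention $Y = P\Delta/2$ stated in the introduction; it is offered as a check on the result rather than as the original working.

```latex
\begin{align*}
\hat{S}_r^2
  &= \frac{1}{2}\left[\left(\frac{P_{r-1}\Delta_{r-1}}{2} - E_{r-1}\right)^2
     + \frac{\Delta_{r-1}^2}{K^2}\right]
  && \text{(25) and (26) in (24)} \\
\left(\frac{\Delta_r^{\mathrm{OPT}}}{\Delta_{r-1}}\right)^2
  &= \frac{K^2\,\hat{S}_r^2}{\Delta_{r-1}^2}
  && \text{from (22)} \\
  &= \frac{K^2}{2}\left[\frac{P_{r-1}^2}{4}
     - \frac{P_{r-1}E_{r-1}}{\Delta_{r-1}}
     + \frac{E_{r-1}^2}{\Delta_{r-1}^2}
     + \frac{1}{K^2}\right] \\
  &= \frac{1}{2} + \frac{K^2}{8}P_{r-1}^2
     + \frac{K^2}{2\Delta_{r-1}^2}
       \left(E_{r-1}^2 - E_{r-1}\Delta_{r-1}P_{r-1}\right).
\end{align*}
```

The first two terms of the last line depend only on B and the previous code word; the remaining term collects everything that involves the unknown quantization error.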



* These K values are relevant for C = 0 because, in this case, the conditional
density function (Fig. 6) is indeed zero-mean Gaussian.

† In general, one may consider a weighted average of the type $uX^2 + vS^2$. The
case of u = 0 will be appropriate for "steady-state" operation, and the use of v = 0
will be appropriate for a "transient" situation. The need for time-invariant step-size
multipliers suggests a compromise design characterized by a weighting of the type
u = v = 0.5.
$E_{r-1}$ is an unknown random variable, but the following can be said
about its role in (27):

First, the $E_{r-1}^2$ term is significant only for the last quantizer slot in
which, due to possible overload, $E_{r-1}^2$ can be arbitrarily large. Furthermore,
for this end slot the $-E_{r-1}\Delta_{r-1}P_{r-1}$ term tends to be positive.
Notice, from definition (25), that E is negative in overload when P is
positive and vice versa.

For the remaining quantizer slots, $E_{r-1}^2$ is again positive but no
longer significant, and $-E_{r-1}\Delta_{r-1}P_{r-1}$ is expected to be negligible as
well, on the average. This is by virtue of the uniform PDF approximation
for granular errors

$$p(E) = 1/\Delta; \quad -\frac{\Delta}{2} < E_{\mathrm{gran}} < \frac{\Delta}{2} \tag{28}$$

and a consequent decorrelation of output $P\Delta$ and error E.

The optimum multiplier function [square root of (27)] can therefore
be expressed in the form

$$M_r^{\mathrm{OPT}} = \left[\frac{1}{2} + \frac{K^2}{8}P_{r-1}^2\right]^{1/2} + \delta^2(|P_{r-1}|); \quad C = 0; \tag{29}$$

where $\delta^2$ is a positive correction term that is significant only for the
end slot:

$$\delta^2(|P_{r-1}|) \approx 0 \quad \text{if} \quad |P_{r-1}| \neq 2^B - 1. \tag{30}$$

Table VI compares the M values from (29) with those from the
simulation in Section II.
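As a numerical illustration, the bracketed part of (29) can be evaluated directly from the K values in (23). The sketch below is a minimal Python example with a hypothetical function name; it omits the end-slot correction term, whose value the theory does not specify, and it assumes code-word magnitudes |P| = 1, 3, ..., 2^B − 1 as defined in the introduction.

```python
import math

# K(B) from (23): values quoted from Max's table for a zero-mean Gaussian input.
K = {1: 1.596, 2: 0.996, 3: 0.586, 4: 0.335}


def m_opt_uncorrelated(bits):
    """Evaluate [1/2 + (K^2 / 8) P^2]^(1/2) from (29) for each magnitude slot.

    Returns one multiplier per |P| = 1, 3, ..., 2**bits - 1, innermost slot
    first. The end-slot correction term of (29) is NOT included.
    """
    k_squared = K[bits] ** 2
    return [math.sqrt(0.5 + k_squared * p * p / 8.0)
            for p in range(1, 2 ** bits, 2)]


if __name__ == "__main__":
    for bits in (2, 3):
        print(bits, [round(m, 3) for m in m_opt_uncorrelated(bits)])
```

For B = 2 and B = 3 the computed values agree with the theoretical entries of Table VI to within about 0.02.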

3.2 Comparison With Stroh's Adaptation Logic (C = 0) 

Consider, in place of (24), a simpler variance estimation of the
type considered by Stroh:²

$$\hat{S}_r^2 = X_{r-1}^2. \tag{31}$$

This results, by virtue of (25), (22), and arguments similar to those
at the end of the previous subsection, in a multiplier function of the form

$$M_r = \frac{K}{2}|P_{r-1}| + \delta^2(|P_{r-1}|); \qquad \delta^2(|P_{r-1}|) \approx 0 \quad \text{if} \quad |P_{r-1}| \neq 2^B - 1. \tag{32}$$

Table VI lists values of $M_r^{\mathrm{OPT}}$ (29), $M_r$ (32), and the experimental
optima $M_{\mathrm{exp}}$ from Table V. Values of K have been taken from (23).

Table VI: Comparison of Multiplier Functions

              B = 2                                          B = 3
|P_{r-1}|     M_r               M^OPT             M_exp      M_r               M^OPT             M_exp
    1         0.50              0.79              0.80       0.29              0.75              0.90
    2         1.50 + $\delta^2$   1.27 + $\delta^2$   1.60       0.87              0.94              0.90
    3                                                        1.45              1.26              1.25
    4                                                        2.00 + $\delta^2$   1.61 + $\delta^2$   1.75

Notice how $M_r^{\mathrm{OPT}}$ provides a better specification of optimal multipliers
than does $M_r$. Furthermore, as B increases, the constant K
approaches zero and the theoretical multiplier functions for the innermost
slot ($P_{r-1} = \pm 1$) have the following limiting behavior:



$$\lim_{B \to \infty} M_r(1) = 0 \tag{33}$$

$$\lim_{B \to \infty} M_r^{\mathrm{OPT}}(1) = (1/2)^{1/2} = 0.71. \tag{34}$$

Simulations with B = 4 and 5 have verified that the trend in (34)
is indeed more realistic than that in (33).
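The two limits can be seen numerically by evaluating the innermost-slot multipliers of (32) and (29) for decreasing K. The following Python sketch is illustrative only: the function names are hypothetical, the end-slot correction is irrelevant for the innermost slot and is omitted, and the last two K values in the loop are made-up small numbers standing in for large B, since (23) lists K only up to B = 4.

```python
import math


def m_simple_inner(k):
    """Innermost-slot multiplier from (32): (K / 2) * |P| with |P| = 1."""
    return k / 2.0


def m_opt_inner(k):
    """Innermost-slot multiplier from (29): sqrt(1/2 + K^2 / 8) with |P| = 1."""
    return math.sqrt(0.5 + k * k / 8.0)


# First three K values are K(2), K(3), K(4) from (23); 0.1 and 0.01 are
# hypothetical values used only to show the trend as K approaches zero.
for k in (0.996, 0.586, 0.335, 0.1, 0.01):
    print(f"K = {k:5.3f}   simple: {m_simple_inner(k):.3f}   optimum: {m_opt_inner(k):.3f}")
```

The simple rule drives the innermost multiplier toward zero, which would collapse the step-size after a single small sample, whereas the averaging in (24) keeps it bounded below by the square root of one half.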

It should be mentioned that the adaptation strategy (31) is only
the simplest case of Stroh's² method, which has a general variance
estimator of the form

$$\hat{S}_{r,n}^2 = \frac{1}{n}\sum_{u=1}^{n} X_{r-u}^2. \tag{35}$$
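A running estimator of the form (35) can be sketched as below. This is a minimal Python illustration with a hypothetical function name; the choice to produce no estimate until n past samples are available is an assumption, since start-up behavior is not described here.

```python
from collections import deque


def variance_estimates(samples, n):
    """Return mean-square estimates built from the n most recent past samples.

    The k-th returned value is the estimate available when sample number
    n + k (0-based) arrives, i.e. (1/n) * sum of squares of the n samples
    immediately preceding it. Nothing is returned for the first n samples
    (an assumed start-up convention).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    window = deque(maxlen=n)  # holds the n most recent past samples
    estimates = []
    for x in samples:
        if len(window) == n:
            estimates.append(sum(v * v for v in window) / n)
        window.append(x)
    return estimates
```

With n = 1 the function reduces to the single-sample rule (31): each estimate is simply the square of the preceding sample.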



It is interesting, nevertheless, that for the same length (n = 1 or
one-word) of quantizer memory, our adaptation rule specifies better
step-size multipliers, as seen in Table V. In fact, the use of $M_r^{\mathrm{OPT}}$
yields, for (B = 3, C = 0), an SNR (for Gaussian signals) which is
better than what Stroh reports for n = 2 (10 dB; N = 2500) in his
Fig. 3.3. With the experimentally optimized M-function (Table V), we
do significantly better still: the SNR value of 12.7 dB for this
case is equivalent to n = 6 in Stroh's logic and falls short of the
optimum (n = ∞ in Stroh) by not much more than 1 dB.

The efficiency of our logic is clearly attributable to the way we
exploit quantizer memory, namely, in terms of P and $\Delta$, rather than
in terms of the product of the two quantities (the quantizer output Y
used by Stroh). Physically, the use of $P\Delta$ for adaptation seems to
wipe out some of the "overload" and "underload" cues that an individual
knowledge of P and $\Delta$ preserves.
Table VII: Comparison of Theoretical ($M^{\mathrm{OPT}}$) and Experimental
($M_{\mathrm{exp}}$, in Parentheses) Multipliers

B    Multiplier    C = 0                        C = 0.99
2    M(1)          0.79 (0.8)                   0.5 (0.5)
     M(2)          [1.27 + $\delta^2$] (1.6)      [1.5 + $\delta^2$] (2.0)
3    M(1)          0.75 (0.9)                   0.25 (0.3)
     M(2)          0.94 (0.9)                   0.75 (0.9)
     M(3)          1.26 (1.25)                  1.25 (1.5)
     M(4)          [1.61 + $\delta^2$] (1.75)     [1.75 + $\delta^2$] (2.1)

3.3 Case of C → 1

When the adjacent-sample correlation C approaches unity, the conditional
PDF (probability density function) of $X_r$ approaches a
Gaussian spike centered at $CX_{r-1}$ (Fig. 6). The width of the spike
is proportional to the square root of $(1 - C^2)$, and therefore approaches
zero irrespective of the value of signal variance S. The adaptive
quantization problem is no longer one of variance estimation. It
consists, instead, of a "fool-proof" strategy of the following type:
Select a step-size $\Delta_r$ such that the PDF spike at $CX_{r-1}$ falls right in
the middle of the positive (or negative) half of the quantizer range,
assuming that $CX_{r-1}$ is positive (or negative). If we recall that a
B-bit quantizer has a half-range width equal to $2^{B-1}\Delta$, we see that the
requirement (assuming positive quantities throughout) is:

$$CX_{r-1} = \frac{2^{B-1}\Delta_r}{2}; \quad C \to 1. \tag{36}$$

The logic clearly provides simultaneous protection against both overload
and underload. Utilizing the estimate of $X_{r-1}$ (25) in (36), we
obtain the condition

$$C\left[\frac{P_{r-1}\Delta_{r-1}}{2} - E_{r-1}\right] = \frac{2^{B-1}\Delta_r}{2}; \quad C \to 1. \tag{37}$$

Equivalently, with the usual assumptions on the quantization error $E_{r-1}$,

$$\lim_{C \to 1} M_r^{\mathrm{OPT}} = \frac{\Delta_r}{\Delta_{r-1}} = \frac{|P_{r-1}|}{2^{B-1}} + \delta^2(|P_{r-1}|); \qquad \delta^2(|P_{r-1}|) \approx 0 \quad \text{if} \quad |P_{r-1}| \neq 2^B - 1. \tag{38}$$
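The leading term of (38) is simple enough to tabulate directly. The Python sketch below uses a hypothetical function name, assumes code-word magnitudes |P| = 1, 3, ..., 2^B − 1 as in the introduction, and leaves out the end-slot correction term.

```python
def m_opt_high_correlation(bits):
    """Evaluate |P| / 2**(bits - 1) from (38) for each magnitude slot.

    Returns one multiplier per |P| = 1, 3, ..., 2**bits - 1, innermost slot
    first. The end-slot correction term of (38) is NOT included.
    """
    half_range = 2 ** (bits - 1)
    return [p / half_range for p in range(1, 2 ** bits, 2)]


if __name__ == "__main__":
    print(m_opt_high_correlation(2))  # [0.5, 1.5]
    print(m_opt_high_correlation(3))  # [0.25, 0.75, 1.25, 1.75]
```

These are the theoretical values that appear in the C = 0.99 column of Table VII.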



See the spike in the histogram of Fig. 7. 
3.4 Comparison with Simulation Results

Results for a general value of C (0 < C < 1) can in principle be
attempted on the basis of a general PDF such as Fig. 6. However,
tractable derivations seem to require too many simplifying assumptions
to make the theory worthwhile, especially in view of the observation
(Table V) that the correlation becomes significant only if C → 1. We
therefore conclude this section by merely listing, in Table VII,
theoretical step-size multipliers for C = 0 and 0.99 [from (29) and (38)]
together with the experimentally optimized multipliers from Section II.

IV. QUANTIZER SIMULATIONS WITH SPEECH AND PICTURE SIGNALS 

In this section, we present results from computer simulations of the 
adaptive quantizer with speech and picture inputs. 

The results in Table VIII refer to a low-pass-filtered speech signal
(about a second long), and a single frame of picture input (the face of
Karen in Picturephone® format⁶). Listed are step-size multipliers found,
by a search procedure, to maximize an asymptotic SNR (10) as measured
over the entire length (N >>> 10,000) of the input sequences.
The following observations are of interest:

(i) The signal PDF seems to have a significant effect (presumably
through overload statistics and the end-slot correction
$\delta^2(|P_{r-1}|)$ of Section III) on the largest step-size multiplier.
Note the value of $M_4$ for picture input.



Table VIII: Step-Size Multipliers for Illustrative Speech and
Picture Signals (Entries in parentheses refer to pictures)

B    Coder type: PCM              Coder type: DPCM
2    0.6, 2.2                     0.8, 1.6
3    0.85, 1, 1, 1.5              0.9, 0.9, 1.25, 1.75
     (0.9, 0.95, 1.5, 2.5)        (0.9, 0.95, 1.5, 2.75)
4    0.8, 0.8, 0.8, 0.8,          0.9, 0.9, 0.9, 0.9,
     1.2, 1.6, 2.0, 2.4           1.2, 1.6, 2.0, 2.4
5    0.85, 0.85, 0.85, 0.85,      0.9, 0.9, 0.9, 0.9,
     0.85, 0.85, 0.85, 0.85,      0.95, 0.95, 0.95, 0.95,
     1.2, 1.4, 1.6, 1.8,          1.2, 1.5, 1.8, 2.1,
     2.0, 2.2, 2.4, 2.6           2.4, 2.7, 3.0, 3.3
Fig. 9: Desirable form of the multiplier function M for the adaptive quantization
of speech signals (B > 2) and first-order Gauss-Markov signals which are not highly
correlated (say, C > 0.5).

(ii) Differentiation has the effect of decreasing adjacent sample
correlation. This seems to explain differences in multipliers as
applied to PCM and differential PCM quantizers for speech.
Note that the effect is most pronounced for B = 2.
(iii) Although the input signals are not first-order Markovian, the
multipliers have the earlier-mentioned property that step-size
increases are relatively more rapid than step-size decreases.
Refer to the general diagram in Fig. 9.
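To make the role of the multipliers concrete, the one-word-memory adaptation can be sketched as a short Python routine. This is a simplified illustration, not the simulation program used for the results above: the function names are hypothetical, the output levels follow the convention Y = PΔ/2 with |P| = 1, 3, ..., 2^B − 1 from the introduction, the step-size limits are enforced by clamping (an assumption), and the test input in the usage example is synthetic uncorrelated Gaussian noise rather than speech or picture data.

```python
import math
import random


def adaptive_quantize(samples, multipliers, delta_start, delta_min, delta_max):
    """Quantize samples with a uniform quantizer whose step-size adapts.

    multipliers[i] is applied after a sample falls in magnitude slot i
    (i = 0 is the innermost slot), so len(multipliers) = 2**(B - 1).
    Returns (outputs, step_sizes), where step_sizes[r] is the step used
    for sample r.
    """
    slots = len(multipliers)
    delta = delta_start
    outputs, step_sizes = [], []
    for x in samples:
        # Magnitude slot of the current sample; the last slot absorbs overload.
        slot = min(int(abs(x) / delta), slots - 1)
        p = 2 * slot + 1  # code-word magnitude |P| = 1, 3, 5, ...
        outputs.append(math.copysign(p * delta / 2.0, x))
        step_sizes.append(delta)
        # One-word memory: the next step depends only on the slot just used.
        delta = min(delta_max, max(delta_min, delta * multipliers[slot]))
    return outputs, step_sizes


def snr_db(samples, outputs):
    """Signal-to-quantization-error ratio in dB over the whole sequence."""
    signal = sum(x * x for x in samples)
    noise = sum((x - y) ** 2 for x, y in zip(samples, outputs))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)


if __name__ == "__main__":
    random.seed(0)
    test_input = [random.gauss(0.0, 1.0) for _ in range(10000)]
    # 3-bit DPCM speech multipliers from Table VIII, used here only as an example.
    outputs, _ = adaptive_quantize(test_input, (0.9, 0.9, 1.25, 1.75),
                                   delta_start=0.1, delta_min=0.01, delta_max=1.0)
    print(f"SNR = {snr_db(test_input, outputs):.1f} dB")
```

Because the multipliers above unity are further from unity than those below it, a deliberately small starting step-size recovers within a few samples, which is the fast-increase, slow-decrease behavior noted in observation (iii).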
Table IX: Comparison of Speech Quantizers
(Entries are SNR values in dB)

Quantizer                                          SNR (dB), listed by B
Logarithmic PCM with μ-law quantization            15
Adaptive PCM with uniform quantization             9, 15, 19
Adaptive DPCM with uniform quantization            13, 18, 22
Adaptive DPCM with nonuniform quantization         12, 18, 24

It may be mentioned that in each of the above simulations, the
adaptive techniques also registered an SNR gain of 2 to 4 dB over
optimized nonadaptive quantizers. Table IX shows some results
pertaining to a band-pass-filtered speech sample. These results were
obtained from an independent experiment on coder assessment.¹² The
adaptive quantizers (APCM, ADPCM) used the multipliers of Table
VIII,* and the nonuniform quantizer characteristics employed in
adaptive DPCM are those recommended by Paez and Glisson.¹³
Finally, the log-PCM used μ = 100,⁷ and the adaptive quantizers used
a maximum-to-minimum step-size ratio of 100.

Notice from the table that adaptive quantization, as incorporated
into PCM, has the potential of outperforming the conventional
technique of logarithmic companding. Evidently the advantages over
log-PCM are even more impressive in ADPCM, and a companion
paper will discuss, at length, the use of 3-bit and 4-bit adaptive
quantizers in the DPCM coding of speech.¹